Multilingual Robust Anaphora Resolution
Abstract
Most traditional approaches to anaphora resolution rely heavily on linguistic and domain knowledge. One of the disadvantages of developing a knowledge-based system, however, is that it is a very labour-intensive and time-consuming task. This paper presents a robust, knowledge-poor approach to resolving pronouns in technical manuals. This approach is a modification of the practical approach (Mitkov 1998a) and operates on texts pre-processed by a part-of-speech tagger. Input is checked against agreement and a number of antecedent indicators. Each indicator assigns a score to every candidate, and the candidate with the highest aggregate score is returned as the antecedent. We propose this approach as a platform for multilingual pronoun resolution. The robust approach was initially developed and tested for English, but we have also adapted and tested it for Polish and Arabic. For both languages, we found that adaptation required minimal modification and that, even if used unmodified, the approach delivers acceptable success rates. Preliminary evaluation reports high success rates in the range of and over 90%.
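The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Candidate` and `Pronoun` fields, the two indicator functions, and their score values are hypothetical and chosen only to show the flow: filter candidates by gender and number agreement, let each indicator assign a score, and return the candidate with the highest aggregate score.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Candidate:
    """A noun phrase that may be the antecedent (all fields are illustrative)."""
    text: str
    gender: str      # e.g. "neut", "masc", "fem"
    number: str      # "sg" or "pl"
    definite: bool   # whether the noun phrase is definite
    distance: int    # sentences between candidate and pronoun (0 = same sentence)


@dataclass(frozen=True)
class Pronoun:
    text: str
    gender: str
    number: str


def agrees(pronoun: Pronoun, candidate: Candidate) -> bool:
    """Agreement check: gender and number must both match."""
    return pronoun.gender == candidate.gender and pronoun.number == candidate.number


# Hypothetical indicators: each maps a candidate to an integer score.
# The values below are assumptions for illustration, not the paper's.
def definiteness(candidate: Candidate) -> int:
    return 0 if candidate.definite else -1


def referential_distance(candidate: Candidate) -> int:
    return {0: 2, 1: 1}.get(candidate.distance, 0)


INDICATORS: List[Callable[[Candidate], int]] = [definiteness, referential_distance]


def aggregate_score(candidate: Candidate) -> int:
    """Sum the scores assigned by every indicator."""
    return sum(indicator(candidate) for indicator in INDICATORS)


def resolve(pronoun: Pronoun, candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the agreeing candidate with the highest aggregate score.

    Returns None if no candidate agrees. On ties, max() keeps the
    candidate that appears first in the input list.
    """
    agreeing = [c for c in candidates if agrees(pronoun, c)]
    if not agreeing:
        return None
    return max(agreeing, key=aggregate_score)
```

Because each indicator is a plain function in a list, adapting the sketch to another language would amount to adding, removing, or re-weighting entries in `INDICATORS`, which mirrors the kind of small modification the abstract describes for Polish and Arabic.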
Similar resources
Pronominal Anaphora Resolution in the KANTOO Multilingual Machine Translation System
We present an approach to pronominal anaphora resolution using KANT Controlled Language and the KANTOO multilingual MT system. Our algorithm is based on a robust, syntax-based approach that applies a set of restrictions and preferences to select the correct antecedent. We report a success rate of 93.3% on a training corpus with 286 anaphors, and 88.8% on held-out data with 144 anaphors. Our app...
Bilingual Pronoun Resolution: Experiments in English and French
Anaphora resolution has been a subject of research in computational linguistics for more than 25 years. The interest it aroused was due to the importance that anaphoric phenomena play in the coherence and cohesiveness of natural language. A deep understanding of a text is impossible without knowledge about how individual concepts relate to each other; a shallow understanding of a text is often ...
BART: A Multilingual Anaphora Resolution System
BART (Versley et al., 2008) is a highly modular toolkit for coreference resolution that supports state-of-the-art statistical approaches and enables efficient feature engineering. For the SemEval task 1 on Coreference Resolution, BART runs have been submitted for German, English, and Italian. BART relies on a maximum entropy-based classifier for pairs of mentions. A novel entity-mention approach...
Question Answering with Joost at CLEF 2007
We describe our system for the monolingual Dutch and multilingual English to Dutch QA tasks. First, we present a brief overview of our QA-system, which makes heavy use of syntactic information. Next, we describe the modules that were developed especially for CLEF 2007, i.e. preprocessing of Wikipedia, inclusion of query expansion in IR, anaphora resolution in follow-up questions, and a question...
AQA: a multilingual Anaphora annotation scheme for Question Answering
This paper presents AQA, a multilingual anaphora annotation scheme that can be applied in Machine Learning for the improvement of Question Answering systems. It has been used to annotate the collection of CLEF 2008 in Spanish, Italian and English. AQA is inspired by the MATE meta-model, which has been adjusted to our needs. By using AQA we specify the relationship between the anaphora and its ...